Introduction

The World Bank provides open data for hundreds of indicators, covering most countries over many years. You can learn more at https://datacatalog.worldbank.org/

The WDI package for R provides easy access to this data.

Looking up an indicator

The World Bank data available through this API comes with many variables, including iso2c country codes, indicator names, and more. The WDIsearch() function searches these variables for a string. The code block below searches the name variable for "maternal mortality ratio".

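library(WDI)

# Search the indicator names; short = FALSE also returns the
# description and source columns
WDIsearch(string = "maternal mortality ratio",
          field = "name",
          short = FALSE)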
##      indicator       
## [1,] "SH.STA.MMRT"   
## [2,] "SH.STA.MMRT.NE"
##      name                                                                   
## [1,] "Maternal mortality ratio (modeled estimate, per 100,000 live births)" 
## [2,] "Maternal mortality ratio (national estimate, per 100,000 live births)"
##      description                                                                                                                                                                                                                                                                                                                                                                                                      
## [1,] "Maternal mortality ratio is the number of women who die from pregnancy-related causes while pregnant or within 42 days of pregnancy termination per 100,000 live births. The data are estimated with a regression model using information on the proportion of maternal deaths among non-AIDS deaths in women ages 15-49, fertility, birth attendants, and GDP measured using purchasing power parities (PPPs)."
## [2,] "Maternal mortality ratio is the number of women who die from pregnancy-related causes while pregnant or within 42 days of pregnancy termination per 100,000 live births."                                                                                                                                                                                                                                       
##      sourceDatabase                
## [1,] "World Development Indicators"
## [2,] "World Development Indicators"
##      sourceOrganization                                                                                                                                                     
## [1,] "WHO, UNICEF, UNFPA, World Bank Group, and the United Nations Population Division. Trends in Maternal Mortality: 2000 to 2017. Geneva, World Health Organization, 2019"
## [2,] "UNICEF, State of the World's Children, Childinfo, and Demographic and Health Surveys."
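
The same function can search the other variables as well. The sketch below looks in the description variable and uses short = TRUE to keep only the indicator code and name. It assumes the indicator list bundled with the installed WDI version still contains the phrase shown in the descriptions above; the name hits is just an illustrative variable.

```r
library(WDI)

# Search the indicator descriptions instead of the names.
# short = TRUE keeps only the indicator code and the name.
hits <- WDIsearch(string = "pregnancy-related causes",
                  field = "description",
                  short = TRUE)
hits
```

The search string is matched case-insensitively, so capitalisation does not matter.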

Looking for the data

The WDI() function downloads the selected indicator for the requested countries and years. You can get more information via ?WDI

df <- data.frame(
  WDI(country = c("US", "BR", "ZA"),
      indicator = "SH.STA.MMRT",
      start = 2005, end = 2015, extra = FALSE))
head(df)
##   iso2c country SH.STA.MMRT year
## 1    BR  Brazil          63 2015
## 2    BR  Brazil          62 2014
## 3    BR  Brazil          61 2013
## 4    BR  Brazil          60 2012
## 5    BR  Brazil          61 2011
## 6    BR  Brazil          65 2010
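
As the output above shows, the rows arrive with the most recent year first. If ascending order is preferred, a minimal base R sketch is to sort with order(). To keep it runnable without a download, the data frame below simply re-types the six Brazil rows printed above; brazil_head and brazil_sorted are illustrative names, not part of the original code.

```r
# Six Brazil rows copied from the head(df) output above
brazil_head <- data.frame(
  iso2c = "BR",
  country = "Brazil",
  SH.STA.MMRT = c(63, 62, 61, 60, 61, 65),
  year = 2015:2010
)

# Sort by country code first, then by ascending year
brazil_sorted <- brazil_head[order(brazil_head$iso2c, brazil_head$year), ]
brazil_sorted
```

The same order() call should work unchanged on the full df, since it sorts within each country code.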

The full list of variables returned with extra = TRUE is shown below.

names(WDI(country = "US",
          indicator = "SH.STA.MMRT",
          start = 1990,
          end = 2020, 
          extra = TRUE))
##  [1] "iso2c"       "country"     "SH.STA.MMRT" "year"        "iso3c"      
##  [6] "region"      "capital"     "longitude"   "latitude"    "income"     
## [11] "lending"

Filtering data

We can use filter() from the dplyr package to subset the data by country.

library(dplyr)

# One data frame per country, selected by iso2c code
brazil <- df %>% filter(iso2c == "BR")
usa <- df %>% filter(iso2c == "US")
za <- df %>% filter(iso2c == "ZA")
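
When there are many countries, writing one filter() call per country gets repetitive. One possible alternative is base R's split(), which returns a list with one data frame per country code. The sketch below uses a toy stand-in for df so that it runs offline; its SH.STA.MMRT values are placeholders, not World Bank figures, and the names toy and by_country are illustrative.

```r
# Toy stand-in for df: the SH.STA.MMRT values are placeholders,
# NOT World Bank figures
toy <- data.frame(
  iso2c = rep(c("BR", "US", "ZA"), each = 2),
  year = rep(c(2015, 2014), times = 3),
  SH.STA.MMRT = c(1, 2, 3, 4, 5, 6)
)

# split() returns a named list with one data frame per country code
by_country <- split(toy, toy$iso2c)
names(by_country)
by_country$ZA
```

Each element can then be used like the brazil, usa and za data frames above, for example by_country$BR.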

Plotting the data

The code block below creates three traces for a single plot using plot_ly() from the plotly package.

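library(plotly)

# One vector of values per country; this assumes all three
# subsets list the same years in the same order
trace0 <- brazil$SH.STA.MMRT
trace1 <- usa$SH.STA.MMRT
trace2 <- za$SH.STA.MMRT
dts <- brazil$year
df.plot <- data.frame(dts, trace0, trace1, trace2)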

p1 <- plot_ly(df.plot,
              x = ~dts,
              y = ~trace0,
              name = "Brazil",
              type = "scatter",
              mode = "lines+markers") %>% 
  add_trace(y = ~trace1,
            name = "USA",
            type = "scatter",
            mode = "lines+markers") %>% 
  add_trace(y = ~trace2,
            name = "RSA",
            type = "scatter",
            mode = "lines+markers")%>% 
  layout(title = "Maternal mortality per 100,000 births",
         xaxis = list(title = "Year",
                      zeroline = FALSE),
         yaxis = list(title = "Count",
                      zeroline = FALSE))
p1
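
Building the traces by hand relies on every country having the same years in the same order. A possible alternative, sketched here, is to keep the data in long format and let plot_ly() create one trace per country through its color argument. The toy data frame stands in for df so the sketch runs offline; its values are placeholders, not World Bank figures, and toy and p_long are illustrative names.

```r
library(plotly)

# Toy stand-in for df: the SH.STA.MMRT values are placeholders,
# NOT World Bank figures
toy <- data.frame(
  iso2c = rep(c("BR", "US", "ZA"), each = 3),
  year = rep(2013:2015, times = 3),
  SH.STA.MMRT = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
)

# color = ~iso2c draws one trace per country code
p_long <- plot_ly(toy,
                  x = ~year,
                  y = ~SH.STA.MMRT,
                  color = ~iso2c,
                  type = "scatter",
                  mode = "lines+markers")
p_long <- layout(p_long,
                 title = "Maternal mortality per 100,000 births",
                 xaxis = list(title = "Year", zeroline = FALSE),
                 yaxis = list(title = "Count", zeroline = FALSE))
p_long
```

With the real data, replacing toy with df should give the same three lines as p1, labelled by iso2c code rather than by the names set manually above.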

Using ggplot to plot the data

The idea is to check whether ggplot2 can plot the same data as plot_ly() did above.

library(ggplot2)

plot2 <- df %>% 
  ggplot(aes(x = year, y = SH.STA.MMRT, color = iso2c)) +
  geom_line() +
  geom_point() +
  # year is numeric, so use a continuous scale with one break per year
  scale_x_continuous(breaks = 2005:2015) +
  ggtitle("Maternal mortality per 100,000 births") +
  xlab("Year") +
  ylab("Count")
plot2
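
The two approaches can also be combined: the plotly package provides ggplotly(), which converts a ggplot object into an interactive plotly figure. The sketch below is a minimal illustration under the assumption that both packages are installed. It re-types the six Brazil rows from the head(df) output earlier so that it runs without a download; brazil_head, g and gp are illustrative names.

```r
library(ggplot2)
library(plotly)

# Six Brazil rows copied from the head(df) output earlier
brazil_head <- data.frame(
  year = 2015:2010,
  SH.STA.MMRT = c(63, 62, 61, 60, 61, 65)
)

# Build a static ggplot first
g <- ggplot(brazil_head, aes(x = year, y = SH.STA.MMRT)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 2010:2015) +
  xlab("Year") +
  ylab("Count")

# Convert the ggplot object into an interactive plotly figure
gp <- ggplotly(g)
gp
```

Applied to the full plot, ggplotly(plot2) should give an interactive version of the ggplot2 figure.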

Conclusion

The WDI library is extremely easy to use, yet powerful enough to bring a massive open data resource to your desktop. Learn more about it at: https://cran.r-project.org/web/packages/WDI/WDI.pdf

If you would like to know more about maternal mortality rates in South Africa, you can view this issue of the South African Medical Journal: http://www.samj.org.za/index.php/samj/issue/view/215/showToc